Abstract


The geometric median, also called the $L^1$-median, is often used in robust statistics. Moreover,
it is increasingly common to deal with large samples taking values in high-dimensional
spaces. In this context, a fast recursive estimator has been introduced by
Cardot et al. (2013). This work aims at studying more precisely the asymptotic behavior
of the estimators of the geometric median based on such nonlinear stochastic gradient
algorithms. The $L^p$ rates of convergence as well as almost sure rates of convergence of
these estimators are derived in general separable Hilbert spaces. Moreover, the optimal
rates of convergence in quadratic mean of the averaged algorithm are also given.


Keywords: Functional Data Analysis, Law of Large Numbers, Martingales in Hilbert
space, Recursive Estimation, Robust Statistics, Spatial Median, Stochastic Gradient Algorithms.


1 Introduction 


The geometric median, also called the $L^1$-median, is a generalization of the real median
introduced by Haldane (1948). In the multivariate case, it is closely related to the
Fermat-Weber problem (see Weber (1929)), which consists in finding a point minimizing the
sum of distances from given points. This is a well-known convex optimization problem.
The literature on the estimation of the solution of this problem is very wide. One of the
most usual methods is to use Weiszfeld's algorithm (see Kuhn (1973)), or more recently, to
use the algorithm proposed by Beck and Sabach (2014).


In the more general context of Banach spaces, Kemperman (1987) gives many properties
of the median, such as its existence, its uniqueness, and maybe the most important, its
robustness. Because of this last property, the median is often used in robust statistics. For
example, Minsker (2014) considers it in order to get much tighter concentration bounds
for aggregation of estimators. Cardot et al. (2012) propose a recursive algorithm using the
median for clustering, which is less sensitive to outliers than the $k$-means. One can also
see Chakraborty and Chaudhuri (2014), Cuevas (2014), Bali et al. (2011) or Gervini (2008)
among others for other examples.

In this context, several estimators of the median are proposed in the literature. In the
multivariate case, one of the most usual methods is to consider the Fermat-Weber problem
generated by the sample, and to solve it using Weiszfeld's algorithm (see Vardi and Zhang
(2000) and Möttönen et al. (2010) for example). This method is fast, but can encounter many
difficulties when we deal with a large sample taking values in relatively high dimensional
spaces. Indeed, since it requires storing all the data, it can be difficult or impossible to
perform the algorithm.
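To make the batch approach concrete, here is a minimal sketch of Weiszfeld's fixed-point iteration for the Fermat-Weber problem generated by a sample, assuming the whole sample fits in memory as a NumPy array. The function name `weiszfeld`, the number of iterations and the guard `eps` are illustrative choices and are not taken from the cited references.

```python
import numpy as np

def weiszfeld(points, n_iter=100, eps=1e-12):
    """Batch estimate of the geometric median of the rows of `points`."""
    y = points.mean(axis=0)                      # start from the sample mean
    for _ in range(n_iter):
        dist = np.linalg.norm(points - y, axis=1)
        w = 1.0 / np.maximum(dist, eps)          # guard against a zero distance (assumed convention)
        y = (w[:, None] * points).sum(axis=0) / w.sum()   # weighted mean with weights 1/distance
    return y
```

Every iteration reads the full sample, which is exactly the storage requirement discussed above.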

Dealing with high dimensional or functional data is more and more usual. There exists a
large recent literature on functional data analysis (see Bongiorno et al. (2014), Ferraty and Vieu
(2006) or Silverman and Ramsay (2005) for example), but few of them speak about robustness
(see Cadre (2001) and Cuevas (2014)).


In this large sample and high dimensional context, recursive algorithms have been introduced
by Cardot et al. (2013): a stochastic gradient algorithm, or Robbins-Monro algorithm
(see Robbins and Monro (1951), Bartoli and Del Moral (2001), Duflo (1997), Benveniste et al.
(1990), Kushner and Yin (2003) among others), and its averaged version (see Polyak and Juditsky
(1992)). It enables us to estimate the median in Hilbert spaces, whose dimension is not
necessarily finite, such as functional spaces. The advantage of these algorithms is that they
can treat all the data, can be simply updated, and do not require too much computational
effort. Moreover, it has been proven in Cardot et al. (2013) that the averaged version and
the estimator proposed by Vardi and Zhang (2000) have the same asymptotic distribution.
Other properties were given, such as the strong consistency of these algorithms. Moreover,
the optimal rate of convergence in quadratic mean of the Robbins-Monro algorithm as well
as non asymptotic confidence balls for both algorithms are given in Cardot et al. (2015).


The aim of this work is to give new asymptotic convergence properties in order to have
a deeper knowledge of the asymptotic behaviour of these algorithms. Optimal $L^p$ rates
of convergence for the Robbins-Monro algorithm are given. This enables us, first,
to get the optimal rate of convergence in quadratic mean of the averaged algorithm.
Second, it enables us to get the $L^p$ rates of convergence of the averaged algorithm. Third,
thanks to these results, applying Borel-Cantelli's Lemma, we give an almost sure rate of
convergence of the Robbins-Monro algorithm. Finally, applying a law of large numbers for
martingales (see Duflo (1997) for example), we give an almost sure rate of convergence of
the averaged algorithm.

The paper is organized as follows. In Section 2, we recall the definition of the median
and some important convexity properties. The Robbins-Monro algorithm and its averaged
version are defined in Section 3. After recalling the rate of convergence in quadratic mean
of the Robbins-Monro algorithm given by Cardot et al. (2015), we give the $L^p$ rates of
convergence of the stochastic gradient algorithm as well as the optimal rate of convergence in
quadratic mean of the averaged algorithm in Section 4. Finally, almost sure rates of
convergence of the algorithms are given in Section 5. The lemmas that help in understanding the
structure of the proofs are given along the text, but the proofs are postponed to an
Appendix.


2 Definitions and convexity properties 


Let $H$ be a separable Hilbert space; we denote by $\langle \cdot , \cdot \rangle$ its inner product and by
$\| \cdot \|$ the associated norm. Let $X$ be a random variable taking values in $H$. The geometric
median $m$ of $X$ is defined by

$$ m := \arg\min_{h \in H} \mathbb{E}\left[ \|X - h\| - \|X\| \right]. \tag{1} $$


We suppose from now that the following assumptions are fulfilled: 


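(A1) $X$ is not concentrated on a straight line: for all $h \in H$, there is $h' \in H$ such that
$\langle h, h' \rangle = 0$ and $\mathrm{Var}(\langle X, h' \rangle) > 0$.

(A2) $X$ is not concentrated around single points: there is a positive constant $C$ such that for
all $h \in H$,

$$ \mathbb{E}\left[ \frac{1}{\|X-h\|} \right] \leq C, \qquad \mathbb{E}\left[ \frac{1}{\|X-h\|^2} \right] \leq C. $$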


Remark that since $\mathbb{E}\left[ 1/\|X-h\|^2 \right] \leq C$, as a particular case, $\mathbb{E}\left[ 1/\|X-h\| \right] \leq \sqrt{C}$.
Note that for the sake of simplicity, even if it means supposing $C \geq 1$, we take $C$ instead of
$\sqrt{C}$. Assumption (A1) ensures that the median $m$ is uniquely defined (Kemperman, 1987).
Assumption (A2) is not restrictive whenever $d \geq 3$, where $d$ is the dimension of $H$, not
necessarily finite (see Cardot et al. (2013) and Chaudhuri (1992) for more details). Note that
many convergence results can be found without Assumption (A2) if we deal with data taking
values in compact sets (see Arnaudon et al. (2012) or Yang (2010) for example).


Let $G$ be the function we would like to minimize. It is defined for all $h \in H$ by

$$ G(h) := \mathbb{E}\left[ \|X-h\| - \|X\| \right]. $$

This function is convex and many convexity properties are given in Chaudhuri (1992),
Gervini (2008), Cardot et al. (2013) and Cardot et al. (2015). We recall two important ones:


(P1) $G$ is Fréchet-differentiable and its gradient is given for all $h \in H$ by

$$ \Phi(h) := \nabla_h G = -\mathbb{E}\left[ \frac{X-h}{\|X-h\|} \right]. $$

The median $m$ is the unique zero of $\Phi$.

(P2) $G$ is twice differentiable and for all $h \in H$, $\Gamma_h$ stands for the Hessian of $G$ at $h$. Thus,
$H$ admits an orthonormal basis composed of eigenvectors of $\Gamma_h$, and let $(\lambda_{i,h})$ be the
eigenvalues of $\Gamma_h$; we have $0 \leq \lambda_{i,h} \leq C$.

Moreover, for all positive constants $A$, there is a positive constant $c_A$ such that for all
$h \in \mathcal{B}(0, A)$, $c_A \leq \lambda_{i,h} \leq C$.

As a particular case, let $\lambda_{\min}$ be the smallest eigenvalue of $\Gamma_m$; there is a positive
constant $c_m$ such that $0 < c_m \leq \lambda_{\min} \leq C$.


3 The algorithms 


Let $X_1, X_2, \ldots$ be independent random variables with the same law as $X$. We recall the
algorithm for estimation of the geometric median introduced by Cardot et al. (2013), defined
as follows:

$$ Z_{n+1} = Z_n + \gamma_n \frac{X_{n+1} - Z_n}{\|X_{n+1} - Z_n\|}, \tag{2} $$

where the initialization $Z_1$ is chosen bounded ($Z_1 = X_1 \mathbf{1}_{\{\|X_1\| \leq M\}}$ for example) or
deterministic. The sequence $(\gamma_n)$ of steps is positive and verifies the following usual conditions

$$ \sum_{n \geq 1} \gamma_n = \infty, \qquad \sum_{n \geq 1} \gamma_n^2 < \infty. $$


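The averaged version of the algorithm (see Polyak and Juditsky (1992), Cardot et al. (2013))
is given iteratively by

$$ \overline{Z}_{n+1} = \overline{Z}_n - \frac{1}{n+1}\left( \overline{Z}_n - Z_{n+1} \right), \tag{3} $$

where $\overline{Z}_1 = Z_1$. This can be written as $\overline{Z}_n = \frac{1}{n}\sum_{k=1}^n Z_k$.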
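The two recursions (2) and (3) can be sketched in a few lines. This is only an illustration under stated assumptions: a deterministic initialization `z1`, a step sequence of the form $\gamma_n = c_\gamma n^{-\alpha}$, and the convention (ours, not the paper's) that the iterate does not move when $X_{n+1} = Z_n$. The function name `recursive_median` and the default values of `c_gamma` and `alpha` are hypothetical.

```python
import numpy as np

def recursive_median(X, z1, c_gamma=1.0, alpha=0.75):
    """One pass of recursions (2) and (3); row n-1 of X plays the role of X_{n+1}.

    Returns the last Robbins-Monro iterate and the averaged iterate.
    """
    z = np.asarray(z1, dtype=float).copy()     # Z_1, deterministic initialization
    z_bar = z.copy()                           # averaged iterate, started at Z_1
    for n, x in enumerate(X, start=1):
        gamma = c_gamma * n ** (-alpha)        # step gamma_n = c_gamma * n^(-alpha)
        diff = x - z
        norm = np.linalg.norm(diff)
        if norm > 0:                           # direction undefined if X_{n+1} = Z_n: do not move
            z = z + gamma * diff / norm        # recursion (2)
        z_bar = z_bar - (z_bar - z) / (n + 1)  # recursion (3)
    return z, z_bar
```

Each observation is used once and then discarded, so the memory cost does not grow with the sample size.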


The algorithm defined by (2) is a stochastic gradient or Robbins-Monro algorithm. Indeed,
it can be written as follows:

$$ Z_{n+1} = Z_n - \gamma_n \Phi(Z_n) + \gamma_n \xi_{n+1}, \tag{4} $$

where $\xi_{n+1} := \Phi(Z_n) + \frac{X_{n+1}-Z_n}{\|X_{n+1}-Z_n\|}$. Let $(\mathcal{F}_n)$ be the sequence of $\sigma$-algebras defined by
$\mathcal{F}_n := \sigma(X_1, \ldots, X_n) = \sigma(Z_1, \ldots, Z_n)$. Thus, $(\xi_n)$ is a sequence of martingale differences
adapted to the filtration $(\mathcal{F}_n)$. Indeed, for all $n \geq 1$, we have almost surely
$\mathbb{E}\left[ \xi_{n+1} | \mathcal{F}_n \right] = 0$. Linearizing the gradient,

$$ Z_{n+1} - m = (I_H - \gamma_n \Gamma_m)(Z_n - m) + \gamma_n \xi_{n+1} - \gamma_n \delta_n, \tag{5} $$

with $\delta_n := \Phi(Z_n) - \Gamma_m(Z_n - m)$. Note that there is a positive deterministic constant $C_m$
such that for all $n \geq 1$ (see Cardot et al. (2015)), almost surely,

$$ \|\delta_n\| \leq C_m \|Z_n - m\|^2. \tag{6} $$

Moreover, since $\Phi(Z_n) = \int_0^1 \Gamma_{m+t(Z_n-m)}(Z_n-m)\,dt$, applying convexity property (P2), one
can check that almost surely

$$ \|\delta_n\| \leq 2C\|Z_n - m\|. \tag{7} $$
4 $L^p$ rates of convergence of the algorithms


We now consider a step sequence $(\gamma_n)$ of the form $\gamma_n = c_\gamma n^{-\alpha}$ with $c_\gamma > 0$ and $\alpha \in (1/2, 1)$.
The optimal rate of convergence in quadratic mean of the Robbins-Monro algorithm is given
in Cardot et al. (2015). Indeed, it was proven that there are positive constants $c'$, $C'$ such that
for all $n \geq 1$,

$$ \frac{c'}{n^{\alpha}} \leq \mathbb{E}\left[ \|Z_n - m\|^2 \right] \leq \frac{C'}{n^{\alpha}}. \tag{8} $$

Moreover, the $L^p$ rates of convergence were not given, but it was proven that the $p$-th
moments are bounded for all integers $p$: there exists a positive constant $M_p$ such that for all
$n \geq 1$,

$$ \mathbb{E}\left[ \|Z_n - m\|^{2p} \right] \leq M_p. \tag{9} $$


4.1 $L^p$ rates of convergence of the Robbins-Monro algorithm


Theorem 4.1. Assume (A1) and (A2) hold. For all $p \geq 1$, there is a positive constant $K_p$ such that
for all $n \geq 1$,

$$ \mathbb{E}\left[ \|Z_n - m\|^{2p} \right] \leq \frac{K_p}{n^{p\alpha}}. \tag{10} $$

As a corollary, applying Cauchy-Schwarz's inequality, for all $p \geq 1$ and for all $n \geq 1$,

$$ \mathbb{E}\left[ \|Z_n - m\|^{p} \right] \leq \frac{\sqrt{K_p}}{n^{p\alpha/2}}. \tag{11} $$
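For completeness, the Cauchy-Schwarz step behind (11) can be written out as follows. It only uses (10) together with the elementary bound $\mathbb{E}[U] \leq \sqrt{\mathbb{E}[U^2]}$ applied to $U = \|Z_n - m\|^p$:

```latex
\[
\mathbb{E}\left[\|Z_n - m\|^{p}\right]
\le \sqrt{\mathbb{E}\left[\|Z_n - m\|^{2p}\right]}
\le \sqrt{\frac{K_p}{n^{p\alpha}}}
= \frac{\sqrt{K_p}}{n^{p\alpha/2}} .
\]
```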


The proof is given in the Appendix. Since it was proven (see Cardot et al. (2015)) that the
rate for $p = 1$ is the optimal one, one can check, applying Hölder's inequality, that the given
ones for $p \geq 2$ are also optimal. In order to prove this theorem with a strong induction on
$p$ and $n$, we have to introduce two technical lemmas. The first one gives an upper bound
for $\mathbb{E}\left[ \|Z_{n+1} - m\|^{2p} \right]$ when inequality (10) is verified for all $k \leq p - 1$, i.e. when the strong
induction assumptions are verified.

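Lemma 4.1. Assume (A1) and (A2) hold, let $p \geq 2$. If inequality (10) is verified for all integers from
$0$ to $p - 1$, there are a rank $n_\alpha$ and non-negative constants $c_0$, $C_1$, $C_2$ such that for all $n \geq n_\alpha$,

$$ \mathbb{E}\left[ \|Z_{n+1} - m\|^{2p} \right] \leq (1 - c_0\gamma_n)\mathbb{E}\left[ \|Z_n - m\|^{2p} \right] + \frac{C_1}{n^{(p+1)\alpha}} + C_2\gamma_n\mathbb{E}\left[ \|Z_n - m\|^{2p+2} \right]. \tag{12} $$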


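The proof is given in the Appendix. The following lemma gives an upper bound of
$\mathbb{E}\left[ \|Z_{n+1} - m\|^{2p+2} \right]$ when inequality (10) is verified for all $k \leq p - 1$, i.e. when the strong
induction assumptions are verified.

Lemma 4.2. Assume (A1) and (A2) hold, let $p \geq 2$. If inequality (10) is verified for all integers from
$0$ to $p - 1$, there are a rank $n_\alpha$ and non-negative constants $C_1'$, $C_2'$ such that for all $n \geq n_\alpha$,

$$ \mathbb{E}\left[ \|Z_{n+1} - m\|^{2p+2} \right] \leq \left( 1 - \frac{2}{n} \right)^{p+1}\mathbb{E}\left[ \|Z_n - m\|^{2p+2} \right] + \frac{C_1'}{n^{(p+2)\alpha}} + C_2'\gamma_n^2\mathbb{E}\left[ \|Z_n - m\|^{2p} \right]. \tag{13} $$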
The proof is given in the Appendix. Note that, for the sake of simplicity, we denote in the
same way the ranks in Lemma 4.1 and Lemma 4.2.

4.2 Optimal rate of convergence in quadratic mean and $L^p$ rates of convergence of
the averaged algorithm


As done in Cardot et al. (2013) and Pelletier (2000), summing equalities (5) and applying
Abel's transform, we get

$$ n\Gamma_m\left( \overline{Z}_n - m \right) = \frac{T_1}{\gamma_1} - \frac{T_{n+1}}{\gamma_n} + \sum_{k=2}^{n} T_k\left( \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right) - \sum_{k=1}^{n}\delta_k + \sum_{k=1}^{n}\xi_{k+1}, \tag{14} $$

with $T_k := Z_k - m$. Using this decomposition and Theorem 4.1, we can derive the $L^p$ rates
of convergence of the averaged algorithm.

Theorem 4.2. Assume (A1) and (A2) hold. For all integers $p \geq 1$, there is a positive constant $A_p$
such that for all $n \geq 1$,

$$ \mathbb{E}\left[ \left\| \overline{Z}_n - m \right\|^{2p} \right] \leq \frac{A_p}{n^{p}}. \tag{15} $$

The proof is given in the Appendix. It heavily relies on Theorem 4.1 and on the following
lemma, which gives a bound of the $p$-th moments of the sum of (not necessarily independent)
random variables. Note that this is probably not a new result but we were not able to
find a proof in a published reference.

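Lemma 4.3. Let $Y_1, \ldots, Y_n$ be random variables taking values in a normed vector space such that for
all positive constants $q$ and for all $k \geq 1$, $\mathbb{E}\left[ \|Y_k\|^q \right] < \infty$. Thus, for all constants $a_1, \ldots, a_n$ and for
all integers $p$,

$$ \mathbb{E}\left[ \left\| \sum_{k=1}^{n} a_k Y_k \right\|^{p} \right] \leq \left( \sum_{k=1}^{n} |a_k| \left( \mathbb{E}\left[ \|Y_k\|^p \right] \right)^{\frac{1}{p}} \right)^{p}. \tag{16} $$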
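Inequality (16) can be checked numerically. The sketch below evaluates both sides when expectations are taken under the empirical law of a finite sample; the inequality then has to hold up to rounding, including for dependent variables. The function name `lemma_4_3_sides`, the Euclidean norm and the simulated data are illustrative assumptions.

```python
import numpy as np

def lemma_4_3_sides(Y, a, p):
    """Return both sides of inequality (16) under the empirical law of the sample.

    Y: array of shape (n_samples, n, dim), one draw of (Y_1, ..., Y_n) per row.
    a: array of shape (n,), the constants a_1, ..., a_n.
    """
    combo = np.einsum("k,skd->sd", a, Y)                # sum_k a_k Y_k for each draw
    lhs = np.mean(np.linalg.norm(combo, axis=1) ** p)   # E[ || sum_k a_k Y_k ||^p ]
    moments = np.mean(np.linalg.norm(Y, axis=2) ** p, axis=0) ** (1.0 / p)  # (E||Y_k||^p)^(1/p)
    rhs = np.sum(np.abs(a) * moments) ** p
    return lhs, rhs

rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 1, 3))
Y = base + 0.5 * rng.normal(size=(1000, 4, 3))   # four dependent variables sharing `base`
a = np.array([1.0, -2.0, 0.5, 3.0])
for p in (1, 2, 3, 4):
    lhs, rhs = lemma_4_3_sides(Y, a, p)
    print(p, lhs <= rhs)
```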


The proof is given in the Appendix. Finally, the following proposition ensures that the rate
of convergence in quadratic mean given by Theorem 4.2 is the optimal one.

Proposition 4.1. Assume (A1) and (A2) hold. There is a positive constant $c$ such that for all $n \geq 1$,

$$ \mathbb{E}\left[ \left\| \overline{Z}_n - m \right\|^{2} \right] \geq \frac{c}{n}. $$

Note that, applying Hölder's inequality, the previous proposition also ensures that the $L^p$
rates of convergence given by Theorem 4.2 are the optimal ones.
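The Hölder step can be made explicit. As a short sketch, for an integer $p \geq 1$ and with $c$ the constant of Proposition 4.1:

```latex
\[
\frac{c}{n}
\le \mathbb{E}\left[\left\|\overline{Z}_n - m\right\|^{2}\right]
\le \left(\mathbb{E}\left[\left\|\overline{Z}_n - m\right\|^{2p}\right]\right)^{1/p},
\qquad\text{so that}\qquad
\mathbb{E}\left[\left\|\overline{Z}_n - m\right\|^{2p}\right] \ge \frac{c^{p}}{n^{p}} .
\]
```

The lower bound has the same order $n^{-p}$ as the upper bound (15), so the rate cannot be improved.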


5 Almost sure rates of convergence 


It is proven in Cardot et al. (2013) that the Robbins-Monro algorithm converges almost
surely to the geometric median. A direct application of Theorem 4.1 and Borel-Cantelli's
lemma gives the following rates of convergence.

Theorem 5.1. Assume (A1) and (A2) hold. For all $\beta < \alpha$,

$$ \|Z_n - m\| = o\left( \frac{1}{n^{\beta/2}} \right) \quad a.s. \tag{17} $$
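The way Theorem 4.1 and Borel-Cantelli's lemma combine can be sketched as follows. This is an outline under the assumptions of Theorem 4.1, not necessarily the argument of the Appendix. Fix $\beta < \alpha$, $\varepsilon > 0$ and an integer $p$ such that $p(\alpha - \beta) > 1$. By Markov's inequality and (10),

```latex
\[
\sum_{n \ge 1} \mathbb{P}\left[\|Z_n - m\| \ge \varepsilon\, n^{-\beta/2}\right]
\le \sum_{n \ge 1} \frac{n^{p\beta}}{\varepsilon^{2p}}\,
     \mathbb{E}\left[\|Z_n - m\|^{2p}\right]
\le \frac{K_p}{\varepsilon^{2p}} \sum_{n \ge 1} \frac{1}{n^{p(\alpha-\beta)}} < \infty .
\]
```

Borel-Cantelli's lemma then gives $\limsup_n n^{\beta/2}\|Z_n - m\| \leq \varepsilon$ almost surely, and letting $\varepsilon$ tend to $0$ along a countable sequence yields (17).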


The proof is given in the Appendix. As a corollary, using decomposition (14) and
Theorem 5.1, we get the following bound of the rate of convergence of the averaged algorithm:

Corollary 5.1. Assume (A1) and (A2) hold. For all $\delta > 0$,

$$ \left\| \overline{Z}_n - m \right\| = o\left( \frac{(\ln n)^{\frac{1+\delta}{2}}}{\sqrt{n}} \right) \quad a.s. \tag{18} $$

The proof is given in the Appendix.
A Appendix

A.1 Proofs of Section 4.1


First we recall some technical inequalities (see Petrov (1995) for example).

Lemma A.1. Let $a, b, c$ be positive constants. Thus,

$$ ab \leq \frac{a^2}{2c} + \frac{b^2 c}{2}, \qquad a \leq \frac{c}{2} + \frac{a^2}{2c}. $$

Moreover, let $k, p$ be positive integers and $a_1, \ldots, a_p$ be positive constants. Thus,

$$ \left( \sum_{j=1}^{p} a_j \right)^{k} \leq p^{k-1}\sum_{j=1}^{p} a_j^{k}. $$
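These inequalities are elementary; as a brief reminder (our own sketch, not taken from the cited reference), the first one is a completed square, the second is the first one with $b = 1$, and the third is the convexity of $x \mapsto x^k$ on $[0, \infty)$:

```latex
\[
0 \le \left(\frac{a}{\sqrt{c}} - b\sqrt{c}\right)^{2}
  = \frac{a^{2}}{c} - 2ab + b^{2}c ,
\qquad
\left(\frac{1}{p}\sum_{j=1}^{p} a_j\right)^{k} \le \frac{1}{p}\sum_{j=1}^{p} a_j^{k} .
\]
```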


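Proof of Lemma 4.2. We suppose from now that for all $k \leq p - 1$, there is a positive constant
$K_k$ such that for all $n \geq 1$,

$$ \mathbb{E}\left[ \|Z_n - m\|^{2k} \right] \leq \frac{K_k}{n^{k\alpha}}. \tag{19} $$

Using decomposition (4) and Cauchy-Schwarz's inequality, since by definition of $\xi_{n+1}$ we
have $\|\xi_{n+1}\|^2 - 2\langle \Phi(Z_n), \xi_{n+1} \rangle \leq 1$,

$$ \begin{aligned} \|Z_{n+1} - m\|^2 &= \|Z_n - m - \gamma_n\Phi(Z_n)\|^2 + \gamma_n^2\|\xi_{n+1}\|^2 + 2\gamma_n\langle Z_n - m - \gamma_n\Phi(Z_n), \xi_{n+1} \rangle \\ &\leq \|Z_n - m - \gamma_n\Phi(Z_n)\|^2 + \gamma_n^2 + 2\gamma_n\langle Z_n - m, \xi_{n+1} \rangle. \end{aligned} $$

Let $V_n := \|Z_n - m - \gamma_n\Phi(Z_n)\|^2$. Using the previous inequality,

$$ \begin{aligned} \|Z_{n+1} - m\|^{2p+2} &\leq \left( V_n + \gamma_n^2 + 2\gamma_n\langle \xi_{n+1}, Z_n - m \rangle \right)^{p+1} \\ &= \left( V_n + \gamma_n^2 \right)^{p+1} + 2(p+1)\gamma_n\langle \xi_{n+1}, Z_n - m \rangle\left( V_n + \gamma_n^2 \right)^{p} && (20) \\ &\quad + \sum_{k=2}^{p+1}\binom{p+1}{k}\left( 2\gamma_n\langle \xi_{n+1}, Z_n - m \rangle \right)^{k}\left( V_n + \gamma_n^2 \right)^{p+1-k}. && (21) \end{aligned} $$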

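We shall upper bound the three terms in (20) and (21). Applying Cauchy-Schwarz's
inequality and since almost surely $\|\Phi(Z_n)\| \leq C\|Z_n - m\|$,

$$ \begin{aligned} V_n &= \|Z_n - m\|^2 - 2\gamma_n\langle Z_n - m, \Phi(Z_n) \rangle + \gamma_n^2\|\Phi(Z_n)\|^2 \\ &\leq \|Z_n - m\|^2 + 2C\gamma_n\|Z_n - m\|^2 + C^2\gamma_n^2\|Z_n - m\|^2 \\ &\leq (1 + c_\gamma C)^2\|Z_n - m\|^2. \end{aligned} \tag{22} $$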


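We now bound the expectation of the first term in (20). Indeed,

$$ \begin{aligned} \mathbb{E}\left[ \left( V_n + \gamma_n^2 \right)^{p+1} \right] &= \mathbb{E}\left[ V_n^{p+1} \right] + (p+1)\gamma_n^2\mathbb{E}\left[ V_n^{p} \right] + \sum_{k=0}^{p-1}\binom{p+1}{k}\gamma_n^{2(p+1-k)}\mathbb{E}\left[ V_n^{k} \right] \\ &\leq \mathbb{E}\left[ V_n^{p+1} \right] + (p+1)(1 + c_\gamma C)^{2p}\gamma_n^2\mathbb{E}\left[ \|Z_n - m\|^{2p} \right] \\ &\quad + \sum_{k=0}^{p-1}\binom{p+1}{k}(1 + c_\gamma C)^{2k}\gamma_n^{2(p+1-k)}\mathbb{E}\left[ \|Z_n - m\|^{2k} \right]. \end{aligned} $$

Applying inequality (19),

$$ \sum_{k=0}^{p-1}\binom{p+1}{k}(1 + c_\gamma C)^{2k}\gamma_n^{2(p+1-k)}\mathbb{E}\left[ \|Z_n - m\|^{2k} \right] \leq \sum_{k=0}^{p-1}\binom{p+1}{k}(1 + c_\gamma C)^{2k}c_\gamma^{2(p+1-k)}\frac{K_k}{n^{(2p+2-k)\alpha}} = O\left( \frac{1}{n^{(p+3)\alpha}} \right). $$

As a conclusion, there is a non-negative constant $A_1$ such that for all $n \geq 1$,

$$ \mathbb{E}\left[ \left( V_n + \gamma_n^2 \right)^{p+1} \right] \leq \mathbb{E}\left[ V_n^{p+1} \right] + (p+1)(1 + c_\gamma C)^{2p}\gamma_n^2\mathbb{E}\left[ \|Z_n - m\|^{2p} \right] + \frac{A_1}{n^{(p+3)\alpha}}. \tag{23} $$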


We now bound the second term in (20). Using the facts that $(\xi_{n+1})$ is a sequence of martingale
differences adapted to the filtration $(\mathcal{F}_n)$ and that $Z_n$ is $\mathcal{F}_n$-measurable,

$$ \mathbb{E}\left[ 2(p+1)\gamma_n\langle \xi_{n+1}, Z_n - m \rangle\left( V_n + \gamma_n^2 \right)^{p} \right] = 0. \tag{24} $$


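Finally, we bound the last term in (21), denoted by $(*)$. Since almost surely $\|\xi_{n+1}\| \leq 2$,
applying Cauchy-Schwarz's inequality,

$$ \begin{aligned} (*) &\leq \sum_{k=2}^{p+1}\sum_{j=0}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{k}\gamma_n^{k+2j}\|\xi_{n+1}\|^{k}\|Z_n - m\|^{k}V_n^{p+1-k-j} \\ &\leq \sum_{k=2}^{p+1}\sum_{j=0}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{2k}\gamma_n^{k+2j}\|Z_n - m\|^{k}V_n^{p+1-k-j}. \end{aligned} \tag{25} $$

Since almost surely $V_n \leq (1 + c_\gamma C)^2\|Z_n - m\|^2$ (see inequality (22)),

$$ \begin{aligned} (*) &\leq \sum_{k=2}^{p+1}\sum_{j=0}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{2k}\gamma_n^{k+2j}(1 + c_\gamma C)^{2(p+1-k-j)}\|Z_n - m\|^{2p+2-k-2j} \\ &= \sum_{k=2}^{p+1}\sum_{j=1}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{2k}\gamma_n^{k+2j}(1 + c_\gamma C)^{2(p+1-k-j)}\|Z_n - m\|^{2p+2-k-2j} && (26) \\ &\quad + \sum_{k=3}^{p+1}\binom{p+1}{k}2^{2k}\gamma_n^{k}(1 + c_\gamma C)^{2(p+1-k)}\|Z_n - m\|^{2p+2-k} + 2^{4}\binom{p+1}{2}\gamma_n^{2}(1 + c_\gamma C)^{2p-2}\|Z_n - m\|^{2p}. \end{aligned} $$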

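We bound the expectation of the first two terms on the right-hand side of (26). For the first
one, applying Cauchy-Schwarz's inequality,

$$ \begin{aligned} &\mathbb{E}\left[ \sum_{k=2}^{p+1}\sum_{j=1}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{2k}\gamma_n^{k+2j}(1 + c_\gamma C)^{2p+2-2k-2j}\|Z_n - m\|^{2p+2-k-2j} \right] \\ &\quad \leq \sum_{k=2}^{p+1}\sum_{j=1}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{2k}\gamma_n^{k+2j}(1 + c_\gamma C)^{2p+2-2k-2j}\sqrt{\mathbb{E}\left[ \|Z_n - m\|^{2(p-j)} \right]}\sqrt{\mathbb{E}\left[ \|Z_n - m\|^{2(p+2-k-j)} \right]}. \end{aligned} $$

Applying inequality (19),

$$ \begin{aligned} &\mathbb{E}\left[ \sum_{k=2}^{p+1}\sum_{j=1}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{2k}\gamma_n^{k+2j}(1 + c_\gamma C)^{2p+2-2k-2j}\|Z_n - m\|^{2p+2-k-2j} \right] \\ &\quad \leq \sum_{k=2}^{p+1}\sum_{j=1}^{p+1-k}\binom{p+1}{k}\binom{p+1-k}{j}2^{2k}\gamma_n^{k+2j}(1 + c_\gamma C)^{2p+2-2k-2j}\frac{\sqrt{K_{p-j}}}{n^{\frac{(p-j)\alpha}{2}}}\frac{\sqrt{K_{p+2-k-j}}}{n^{\frac{(p+2-k-j)\alpha}{2}}} \\ &\quad = O\left( \frac{1}{n^{(p+2)\alpha}} \right). \end{aligned} $$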

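Similarly, for the second term on the right-hand side of (26), applying Cauchy-Schwarz's
inequality, let

$$ \begin{aligned} (**) &:= \sum_{k=3}^{p+1}\binom{p+1}{k}2^{2k}\gamma_n^{k}(1 + c_\gamma C)^{2(p+1-k)}\mathbb{E}\left[ \|Z_n - m\|^{2p+2-k} \right] \\ &\leq \sum_{k=4}^{p+1}\binom{p+1}{k}2^{2k}\gamma_n^{k}(1 + c_\gamma C)^{2(p+1-k)}\sqrt{\mathbb{E}\left[ \|Z_n - m\|^{2(p+3-k)} \right]}\sqrt{\mathbb{E}\left[ \|Z_n - m\|^{2(p-1)} \right]} \\ &\quad + 64\binom{p+1}{3}(1 + c_\gamma C)^{2p-4}\gamma_n^{3}\mathbb{E}\left[ \|Z_n - m\|^{2p-1} \right]. \end{aligned} $$

Applying Lemma A.1 and inequality (19),

$$ \begin{aligned} (**) &\leq \sum_{k=4}^{p+1}\binom{p+1}{k}2^{2k}\gamma_n^{k}(1 + c_\gamma C)^{2p+2-2k}\frac{\sqrt{K_{p+3-k}K_{p-1}}}{n^{(p+1-k/2)\alpha}} \\ &\quad + 32\binom{p+1}{3}(1 + c_\gamma C)^{2p-4}\gamma_n^{2}\left( \mathbb{E}\left[ \|Z_n - m\|^{2p} \right] + \gamma_n^{2}\mathbb{E}\left[ \|Z_n - m\|^{2(p-1)} \right] \right) \\ &= O\left( \frac{1}{n^{(p+2)\alpha}} \right) + 32\binom{p+1}{3}(1 + c_\gamma C)^{2p-4}\gamma_n^{2}\mathbb{E}\left[ \|Z_n - m\|^{2p} \right]. \end{aligned} $$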

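Finally, let us denote by $(***)$ the expectation of the term in (21). There is a positive constant
$A_2$ such that for all $n \geq 1$,

$$ (***) \leq \frac{A_2}{n^{(p+2)\alpha}} + 16\binom{p+1}{2}(1 + c_\gamma C)^{2p-2}\gamma_n^{2}\mathbb{E}\left[ \|Z_n - m\|^{2p} \right] + 32\binom{p+1}{3}(1 + c_\gamma C)^{2p-4}\gamma_n^{2}\mathbb{E}\left[ \|Z_n - m\|^{2p} \right]. \tag{27} $$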

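Applying inequalities (23), (24) and (27), there are positive constants $C_1''$, $C_2'$ such that for all
$n \geq 1$,

$$ \mathbb{E}\left[ \|Z_{n+1} - m\|^{2p+2} \right] \leq \mathbb{E}\left[ V_n^{p+1} \right] + \frac{C_1''}{n^{(p+2)\alpha}} + C_2'\gamma_n^{2}\mathbb{E}\left[ \|Z_n - m\|^{2p} \right]. \tag{28} $$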


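In order to conclude, we need to bound $\mathbb{E}\left[ V_n^{p+1} \right]$. Applying Lemma 5.2 in Cardot et al.,
there are a positive constant $c$ and a rank $n_\alpha$ such that for all $n \geq n_\alpha$,

$$ \mathbb{E}\left[ V_n^{p+1}\mathbf{1}_{\{\|Z_n - m\| \leq cn^{1-\alpha}\}} \right] \leq \left( 1 - \frac{2}{n} \right)^{p+1}\mathbb{E}\left[ \|Z_n - m\|^{2p+2} \right]. \tag{29} $$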


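Finally, since there is a positive constant $c_0$ such that almost surely $\|Z_n - m\| \leq c_0 n^{1-\alpha}$ and
since almost surely $V_n \leq (1 + c_\gamma C)^2\|Z_n - m\|^2$,

$$ \begin{aligned} \mathbb{E}\left[ V_n^{p+1}\mathbf{1}_{\{\|Z_n - m\| > cn^{1-\alpha}\}} \right] &\leq (1 + c_\gamma C)^{2p+2}\mathbb{E}\left[ \|Z_n - m\|^{2p+2}\mathbf{1}_{\{\|Z_n - m\| > cn^{1-\alpha}\}} \right] \\ &\leq (1 + c_\gamma C)^{2p+2}c_0^{2p+2}n^{(2p+2)(1-\alpha)}\mathbb{E}\left[ \mathbf{1}_{\{\|Z_n - m\| > cn^{1-\alpha}\}} \right] \\ &= (1 + c_\gamma C)^{2p+2}c_0^{2p+2}n^{(2p+2)(1-\alpha)}\mathbb{P}\left[ \|Z_n - m\| > cn^{1-\alpha} \right]. \end{aligned} $$

Applying inequality (9) and Markov's inequality, for all integers $q \geq 1$,

$$ \begin{aligned} \mathbb{E}\left[ V_n^{p+1}\mathbf{1}_{\{\|Z_n - m\| > cn^{1-\alpha}\}} \right] &\leq (1 + c_\gamma C)^{2p+2}c_0^{2p+2}n^{(2p+2)(1-\alpha)}\frac{\mathbb{E}\left[ \|Z_n - m\|^{2q} \right]}{\left( cn^{1-\alpha} \right)^{2q}} \\ &\leq \frac{(1 + c_\gamma C)^{2p+2}c_0^{2p+2}M_q}{c^{2q}}\frac{1}{n^{2q(1-\alpha) - (2p+2)(1-\alpha)}}. \end{aligned} $$

Taking $q \geq p + 1 + \frac{(p+2)\alpha}{2(1-\alpha)}$,

$$ \mathbb{E}\left[ V_n^{p+1}\mathbf{1}_{\{\|Z_n - m\| > cn^{1-\alpha}\}} \right] = O\left( \frac{1}{n^{(p+2)\alpha}} \right). \tag{30} $$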

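Finally, using inequalities (28) to (30), there is a positive constant $C_1'$ such that for all $n \geq n_\alpha$,

$$ \mathbb{E}\left[ \|Z_{n+1} - m\|^{2p+2} \right] \leq \left( 1 - \frac{2}{n} \right)^{p+1}\mathbb{E}\left[ \|Z_n - m\|^{2p+2} \right] + \frac{C_1'}{n^{(p+2)\alpha}} + C_2'\gamma_n^{2}\mathbb{E}\left[ \|Z_n - m\|^{2p} \right]. \tag{31} $$

□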


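Proof of Lemma 4.1. Since the eigenvalues of $\Gamma_m$ belong to $[\lambda_{\min}, C]$, there are a rank $n_\alpha$ and
a positive constant $c'$ such that for all $n \geq n_\alpha$, we have $\|I_H - \gamma_n\Gamma_m\|_{op} \leq 1 - \lambda_{\min}\gamma_n$ and
$0 \leq (1 - \lambda_{\min}\gamma_n)^2 + 4C^2\gamma_n^2 \leq 1 - c'\gamma_n$. Using decomposition (5) and Cauchy-Schwarz's
inequality, since $\|\delta_n\| \leq 2C\|Z_n - m\|$ and $\|\xi_{n+1}\|^2 - 2\langle \Phi(Z_n), \xi_{n+1} \rangle \leq 1$, we have for all
$n \geq n_\alpha$,

$$ \begin{aligned} \|Z_{n+1} - m\|^2 &\leq (1 - \lambda_{\min}\gamma_n)^2\|Z_n - m\|^2 + 2\gamma_n\langle \xi_{n+1}, Z_n - m - \gamma_n\Phi(Z_n) \rangle \\ &\quad - 2\gamma_n\langle (I_H - \gamma_n\Gamma_m)(Z_n - m), \delta_n \rangle + \gamma_n^2\|\delta_n\|^2 + \gamma_n^2\|\xi_{n+1}\|^2 \\ &\leq (1 - c'\gamma_n)\|Z_n - m\|^2 + 2\gamma_n\|Z_n - m\|\|\delta_n\| + \gamma_n^2 + 2\gamma_n\langle Z_n - m, \xi_{n+1} \rangle. \end{aligned} \tag{32} $$

Thus, for all integers $p \geq 1$ and $n \geq n_\alpha$,

$$ \begin{aligned} \mathbb{E}\left[ \|Z_{n+1} - m\|^{2p} \right] &\leq (1 - c'\gamma_n)\mathbb{E}\left[ \|Z_n - m\|^2\|Z_{n+1} - m\|^{2p-2} \right] + 2\gamma_n\mathbb{E}\left[ \|Z_n - m\|\|\delta_n\|\|Z_{n+1} - m\|^{2p-2} \right] \\ &\quad + \gamma_n^2\mathbb{E}\left[ \|Z_{n+1} - m\|^{2p-2} \right] + 2\gamma_n\mathbb{E}\left[ \langle Z_n - m, \xi_{n+1} \rangle\|Z_{n+1} - m\|^{2p-2} \right]. \end{aligned} \tag{33} $$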
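In order to bound each term in the previous inequality, we give a new upper bound of
$\|Z_{n+1} - m\|^{2p-2}$. By convexity of $G$, we have almost surely $V_n \leq \|Z_n - m\|^2 + \gamma_n^2$, and
inequality (20) can be written as

$$ \begin{aligned} \|Z_{n+1} - m\|^{2p-2} &\leq \left( \|Z_n - m\|^2 + \gamma_n^2 \right)^{p-1} + 2(p-1)\gamma_n\langle \xi_{n+1}, Z_n - m \rangle\left( \|Z_n - m\|^2 + \gamma_n^2 \right)^{p-2} \\ &\quad + \sum_{k=2}^{p-1}\binom{p-1}{k}\left| 2\gamma_n\langle \xi_{n+1}, Z_n - m \rangle \right|^{k}\left( \|Z_n - m\|^2 + \gamma_n^2 \right)^{p-1-k}. \end{aligned} $$

Applying Cauchy-Schwarz's inequality, since $\|\xi_{n+1}\| \leq 2$,

$$ \begin{aligned} \|Z_{n+1} - m\|^{2p-2} &\leq \left( \|Z_n - m\|^2 + \gamma_n^2 \right)^{p-1} + 2(p-1)\gamma_n\langle \xi_{n+1}, Z_n - m \rangle\left( \|Z_n - m\|^2 + \gamma_n^2 \right)^{p-2} \\ &\quad + \sum_{k=2}^{p-1}\binom{p-1}{k}2^{2k}\gamma_n^{k}\|Z_n - m\|^{k}\left( \|Z_n - m\|^2 + \gamma_n^2 \right)^{p-1-k}. \end{aligned} \tag{34} $$

Note that if $p \leq 2$, the last term on the right-hand side of the previous inequality is equal to 0.
Applying the previous inequality, we can now bound each term in inequality (33).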

Step 1: Bounding (1 — c'7„) E \\Z„ — m\\^ \\Zn+i — ni\\^^ 2 _ 

We will bound each term which appears when we multiply (1 — c'j„) || Z„ — m ||2 by the 
bound given by inequality (l34l) . First, applying inequalities (fT^ . 


E 


(1 - c'jn) \\Z„ - m 
= (1 - c' 7 „) E 
< (1 - c' 7 „) E 


Z„ — ni 


Zn-m\\ +7„ 

P-2 /p - 1 


p-i 


|2p 


Z„-m|| 2 P 


+ E 

k=0 

P -2 

+ E 

k=0 


(1-C'7n)7n^^ ' 


\Zn — m 


\2k+2 


p -1 


(1 - CW f c2(p-l-P) ^k+l 

^(ip-i-k)oc- 


Since for ail ic < p — 2, we have 2p — 1 — A:>p + 1, there is a positive constant such that 
for ail n > n^, 


E 


{l-c'^n)\\Zn-mf(\\Zn-mf + ^l) < (l - c' 7 „) E \\Zn- 


p-l 


m 


|2p 


Bi 


n(P+i)a' 

(35) 

Moreover, using the facts that (^„) is a martingale differences sequence adapted to the fil¬ 
tration {Tn), and that Z,, is -measurable. 


E 


(1 - c' 7 „) \\Zn - m||2 2 (p - 1) 7 „ (^„+i, Zn-m)( ||Z„ -m\\^ + 7 ^ 


P-2' 


= 0. (36) 


We can now suppose that $p \geq 3$, since otherwise the last term in inequality (34) is equal to
0. Let


w := (1 - c'jn) E 
< (1 - c'7„) ^ 

k=2 


\\Zn-m/£ 


2 v- / P - 117 ^ll^PlIv ...l |2 , „. 2 ^P ^ ^ 


k=2 

2 P- 2 +;c^^ ( E 


2^^j^„\\Zn-m\n\\Zn-mr + Yn 


P \ r^p-2+k^k 


\Z„ — m 


,2p-k 


+ 7n^^ ^ ^'^E 


\Zn — m 


\k+2 


Applying Cauchy-Schwarz's inequality. 




k^2 


-2+k k 
In 


'E E \\\Zn - 


+ ^ '^\/E 


1Z„ — m 


\2k 


E 


\\Zn - m\ 


Finally, applying inequality (O, 


i fP aP-2+)c„,;t ( y/t^p-l^p+l-k ^2{p-'i-k) 


k=2 




(fc+2)a I 
n 2 / 


= O 


j(p+l)a y ' 


(37) 


because for all 2 < P < p — 1 and p > 3, we have p + k/2> p + 1 and 2p — — 1 > p + 1. 

Thus, there is a positive constant B[ such that for all n > tia, 


E 


(1 - c' 7 „) ||Z„ - mf ||Z„+i - mf < (l - c' 7 „) E [||Z„ - mf P 




,(p+l)a 


• (38) 


Step 2: Bounding 27 „E (^„+i,Z„ - m) ||Z„+i - . 

Applying the fact that (^„) is a martingale differences sequence adapted to the filtration 
{Tn) and applying inequality dMll , let 


(**) := E 2^n {(^n+i,^n “ m) ||Z „+1 - m||^P ^ 

< 4(p - 1 ) 72 e [ (^„+i,Z„ - m)^ (||Z„ -mf + 7 ^) 


P-2 


Since ||^n+i|| < 2 and applying Cauchy-Schwarz's inequality. 


(**) < 4(p - 1)7«E 


(||^„+i|| \\Zn-m\\Y[\\Zn-mf + ^l 


P-2' 


< 16(p - 1 ) 7 ?,E 


||Z„-mf (||Z„-mf+ 7 ^ 


P-2 
With the help of Lemma A.1,

(**) < 2P+2(p - 1)^1 (e 


\Zn — m 


|2(P-1) 




\Z„ - ml 


Applying previous inequality and inequality (fT^ . there is a positive constant B 2 such that 

B' 


E 


27n - m) ||Z „+1 - ^ 


< 


n{p+i)oc' 


(39) 


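Step 3: Bounding $\gamma_n^2\mathbb{E}\left[ \|Z_{n+1} - m\|^{2p-2} \right]$.

Applying inequality (19),

$$ \gamma_n^2\mathbb{E}\left[ \|Z_{n+1} - m\|^{2p-2} \right] \leq \gamma_n^2\frac{K_{p-1}}{(n+1)^{(p-1)\alpha}} = O\left( \frac{1}{n^{(p+1)\alpha}} \right). \tag{40} $$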


Step 4: Bounding 27 „E ||Z„ — m|| ||(i,j|| ||Z„+i — . 

As in step 1, we will bound each term which appears when we multiply 27 ^ 1 1 Z„ — m 1 1 \\3„ 
by the bound given by inequality (l34l) . Since almost surely ||^„|1 < 2C ||Z„ — m||, applying 
inequality (l37l) , one can check 


27 „E 


Zn-m\\ ||^„|| ^ 2^^^ ||Z„ - m\f (||Z„ - mf + jI) 


p—l—k 


< 4C7„E 




k=2 


= 0 


n(P+i)“^ 

Moreover, since (^„) is a martingale differences sequence adapted to the filtration 


E 


2%r\\Z„-m\\ ||<i„|| 2 (p-l) 7 „(^„+i,Z„-?u) [\\Z„-m\\ + 7 ,, 


|2 I ^ 


= 0 . 


Finally, since almost surely ||ci„|| < Cm\\Zn — m\\^ and ||^„|| < 2C\\Z„ — m\\, applying 
Lemma lA.il 


(★**) := E 


2jn\\Zn - m\\\\3n\\ {\\Zn - m\\ + 


|2 , „2^P‘^ 


<2P-VE \\\Z„ - mfp-^ \\Sn\\] +2P-V/ ^E[||z„-m|| ||^„ 


< 2P-1C^7„E 


\Zn — m 


|2p+l 


■2P-^C%P ^E 


\Zn - ml 


Applying Lemma [A.ll 
1 

(★**)< -c'7„E 


\Zn — m 


|2P 


+ 22p-2^7„E [||Z„ - + O ^ ^ 


(p+l)a 
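Thus, there are positive constants $B_3$, $B_3'$ such that

$$ 2\gamma_n\mathbb{E}\left[ \|Z_n - m\|\|\delta_n\|\|Z_{n+1} - m\|^{2p-2} \right] \leq \frac{1}{2}c'\gamma_n\mathbb{E}\left[ \|Z_n - m\|^{2p} \right] + B_3\gamma_n\mathbb{E}\left[ \|Z_n - m\|^{2p+2} \right] + \frac{B_3'}{n^{(p+1)\alpha}}. \tag{41} $$

Step 5: Conclusion. Taking $c_0 = \frac{1}{2}c'$, applying inequalities (38), (39), (40) and (41), there
are positive constants $C_1$, $C_2$ such that for all $n \geq n_\alpha$,

$$ \mathbb{E}\left[ \|Z_{n+1} - m\|^{2p} \right] \leq (1 - c_0\gamma_n)\mathbb{E}\left[ \|Z_n - m\|^{2p} \right] + \frac{C_1}{n^{(p+1)\alpha}} + C_2\gamma_n\mathbb{E}\left[ \|Z_n - m\|^{2p+2} \right]. $$

□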


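Proof of Theorem 4.1. We prove with the help of a complete induction that for all $p \geq 1$, and
for all $\beta \in \left( \alpha, \frac{p+2}{p}\alpha - \frac{1}{p} \right)$, there are positive constants $K_p$, $C_{p,\beta}$ such that for all $n \geq 1$,

$$ \mathbb{E}\left[ \|Z_n - m\|^{2p} \right] \leq \frac{K_p}{n^{p\alpha}}, \qquad \mathbb{E}\left[ \|Z_n - m\|^{2p+2} \right] \leq \frac{C_{p,\beta}}{n^{p\beta}}. $$

This result is proven in Cardot et al. (2015) for $p = 1$. Let $p \geq 2$ and let us suppose from
now that for all integers $k \leq p - 1$, there is a positive constant $K_k$ such that for all $n \geq 1$,

$$ \mathbb{E}\left[ \|Z_n - m\|^{2k} \right] \leq \frac{K_k}{n^{k\alpha}}. \tag{42} $$

We now split the end of the proof into two steps.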

Step 1: Calibration of the constants. 

In order to simplify the demonstration thereafter, we introduce some constants and no¬ 
tations. Let j6 be a constant such that — 1 > j6 > a and let K'^, K'^ ^ be constants such 
that JCp > 2^+P‘^CiCg (Cl is defined in Lemma l4d)) . and 2fCp > K'^ ^ > K'^ > 1. By 
definition of f, there is a rank rip^^ > {n^ is defined in Lemma [4.11 and in Lemma [4.211 

such that for all n > rip^^, 


(l-Co7n)(^^j +2CoJn + - - —rTTTTTT^ < ^ 


(^ 7 ;[)a+(p-a)p 


n) 


(Cl + ‘“24) ^ -^fp+iy-pfi 


< E 


(43) 


with C 2 defined in Lemma [4T] and C[, C 2 are defined in Lemma [4^ Because f> > a, 


(1 - co7„) 


+ 1\P“ 1 


2«+pp+ic C 2 

+ („ + l).+ (/3-.)p - 1 ^ 


+ 2^07n + O 




= 1 - f::Co7n + 0 — . 
In the same way, since /3 < ^ pj6 < 2p + 2 and 




- a = 1 - (2p + 2 - pB) - + 0 

Oi-pf, ^ c n \ n 


P — “p,p^ 


Step 2: The induction. 

Let us take K'^ > 
prove by induction that for all n > 

K' 


2p 


and 


- m 


2p+2 


, we will 


E 


\\Zn-mfP 


< 


npct' 


E 




K' 


< 




nP^ 


Applying Lemma iLll and by induction, since IK^ > 


E 


Z„+i - m 


|2p 


< (1 - Co7n)E \\\Z„ - mf^ 


+ 


Cl 


n(P+i)“ 


+ C27nE \\\Zn-mfP+^ 


K Cl K e 

^ (1 - C07n) ^ + C2ln-^ 


K' 

< (1 - co7„) + 


Cl 


nP“ n(P+b« 


K' 

+ 2C27.^. 


K' 

Factorizing by 


E 


\\Zn+i - mf^] < (1 - Co7„) 


n +1 


pa. 


K' 


7 2C-yC2 


n +1\ 


n 

a+fip 


(n + l)P“ 


+ 


n + 1 


pa 


Cl 


(n + l)P“n“ 




7 (n + l)/5p+'>: 


< (1 - Co7n) 


n + 1 


pa 


K' 


+ 


2P“CiC.. 7« 2'^+^P+ic-yC2 


+ 


K' 


(n + l)P“ (n + l)P“ (n + l)‘'+(/5-“)P (n+ !)?''■ 


Since fC|, > 2i+P'‘CiC-ico \ 


E 


\\Zn+l-m\M < (1-C07n) 


n +1 


pa 


K' 


K' 


2«+/ip+ic C2 


< 


(1 -co7„) 


(n + 1)P“ 2^”^° (n + 1)P“= 

n + l\P“ 1 2 “+/^p+%C2 \ 

+ 2^07^ + I („ + i)pa:- 




(„ + l)a+(p-«;)p (n + l)P'^ 
K' 


By definition of rip^^ (see (l43l)). 


E 


ll2„+i-mfP 


< 


K' 


(n + l)P'*' 


(44) 


In the same way, applying Lemma 4.2 and by induction, since $K'_{p,\beta} \geq 1$, for all
$n \geq n_{p,\beta}$,


E 


Z„+i - m 


|2p+2' 




\Zn — m 


|2p+2 


a 


,(p+2)a 


ChnE 


|Z„ — m 


|2p' 


-\ n) nPf^ niP+^> ^ '^”nP‘» 


-"-n 


o \ P+1 K' C K' K' 


nP^ 


j_ " _L 

+ „(p+2)« + ‘-27n 


nP"= 


Factorizing by 


E 


Zn+i - m 


|2p+2 


2\ 


P+1 




+ iY^ ^p,/s + 


+ C2C^ 


n J (n + l)P^ 

^^l^(p+2)<i 1 


c 


(p+2)a 


K' 


P/^ 


y (n + l)(P+2)'^-P/5 (n + l)P/® 


K' 


P,fi 


(n + l)(P+2)"=-p/5 (n + l)P/^ 


< 


^ J 


2\P+^ /n + l\P^ 


2(p+2)fl 


C[ + qci 


'^2 


K' 


p,f> 


(n + l)(P+2)'>:-p/iy (n + 1)P^' 


By definition of 


E 


Z^+i - m\Y+^ 


K' 


< 


p,f> 


(45) 


(n + l)P/5' 

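which concludes the induction. In order to conclude the proof, we just have to take
$K_p \geq K'_p$, $C_{p,\beta} \geq K'_{p,\beta}$ and

$$ K_p \geq \max_{k \leq n_{p,\beta}} k^{p\alpha}\mathbb{E}\left[ \|Z_k - m\|^{2p} \right], \qquad C_{p,\beta} \geq \max_{k \leq n_{p,\beta}} k^{p\beta}\mathbb{E}\left[ \|Z_k - m\|^{2p+2} \right]. $$

□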


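A.2 Proofs of Section 4.2

Proof of Lemma 4.3. For all integers $p \geq 1$ and $n \geq 1$, there are positive constants $c_b$, $b \in \mathbb{N}^n$,
such that for all non-negative real numbers $y_k$, $k = 1, \ldots, n$,

$$ \left( \sum_{k=1}^{n} y_k \right)^{p} = \sum_{b = (b_1, \ldots, b_n) \in \mathbb{N}^n,\ b_1 + b_2 + \cdots + b_n = p} c_b\, y_1^{b_1}\cdots y_n^{b_n}. \tag{46} $$

As a particular case, applying a classical generalization of Hölder's inequality (see
Smarandache (1996), page 179, for example),

$$ \begin{aligned} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} a_k Y_k \right\|^{p} \right] &\leq \mathbb{E}\left[ \left( \sum_{k=1}^{n} |a_k|\|Y_k\| \right)^{p} \right] \\ &= \sum_{b_1 + \cdots + b_n = p} c_b\, |a_1|^{b_1}\cdots|a_n|^{b_n}\,\mathbb{E}\left[ \|Y_1\|^{b_1}\cdots\|Y_n\|^{b_n} \right] \\ &\leq \sum_{b_1 + \cdots + b_n = p} c_b\, |a_1|^{b_1}\cdots|a_n|^{b_n}\left( \mathbb{E}\left[ \|Y_1\|^{p} \right] \right)^{\frac{b_1}{p}}\cdots\left( \mathbb{E}\left[ \|Y_n\|^{p} \right] \right)^{\frac{b_n}{p}} \\ &= \sum_{b_1 + \cdots + b_n = p} c_b\left( |a_1|\left( \mathbb{E}\left[ \|Y_1\|^{p} \right] \right)^{\frac{1}{p}} \right)^{b_1}\cdots\left( |a_n|\left( \mathbb{E}\left[ \|Y_n\|^{p} \right] \right)^{\frac{1}{p}} \right)^{b_n} \\ &= \left( \sum_{k=1}^{n} |a_k|\left( \mathbb{E}\left[ \|Y_k\|^{p} \right] \right)^{\frac{1}{p}} \right)^{p}. \end{aligned} $$

□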


The following lemma gives the $L^p$ rates of convergence of the martingale term. Note
that this is probably not a new result, but we were not able to find a proof in a published
reference.


Lemma A.2. Let $(\xi_n)$ be a sequence of martingale differences taking values in a Hilbert space $H$
adapted to a filtration $(\mathcal{F}_n)$. Suppose that there is a non-negative constant $M$ such that for all $n \geq 1$,
$\|\xi_n\| \leq M$ almost surely. Then, for all integers $p \geq 1$, there is a positive constant $C_p$ such that for
all $n \geq 1$,

$$ \mathbb{E}\left[ \left\| \sum_{k=2}^{n} \xi_k \right\|^{2p} \right] \leq C_p n^{p}. $$


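Proof of Lemma A.2. We prove Lemma A.2 with the help of a strong induction on $p \geq 1$.
First, if $p = 1$, since $(\xi_n)$ is a sequence of martingale differences adapted to a filtration $(\mathcal{F}_n)$,

$$ \begin{aligned} \mathbb{E}\left[ \left\| \sum_{k=2}^{n} \xi_k \right\|^{2} \right] &= \sum_{k=2}^{n}\mathbb{E}\left[ \|\xi_k\|^2 \right] + 2\sum_{k=2}^{n}\sum_{k'=k+1}^{n}\mathbb{E}\left[ \langle \xi_k, \xi_{k'} \rangle \right] \\ &\leq (n-1)M^2 + 2\sum_{k=2}^{n}\sum_{k'=k+1}^{n}\mathbb{E}\left[ \langle \xi_k, \mathbb{E}\left[ \xi_{k'} | \mathcal{F}_{k'-1} \right] \rangle \right] \\ &= (n-1)M^2. \end{aligned} $$

Let $p \geq 2$ and for all $n \geq 2$, $M_n := \sum_{k=2}^{n}\xi_k$. We suppose from now that for all $k \leq p - 1$,
there is a positive constant $C_k$ such that for all $n \geq 2$,

$$ \mathbb{E}\left[ \|M_n\|^{2k} \right] \leq C_k(n-1)^{k}. $$

For all $n \geq 2$,

$$ \begin{aligned} \|M_{n+1}\|^2 &= \|M_n\|^2 + 2\langle M_n, \xi_{n+1} \rangle + \|\xi_{n+1}\|^2 \\ &\leq \|M_n\|^2 + 2\langle M_n, \xi_{n+1} \rangle + M^2. \end{aligned} $$

Thus,

$$ \|M_{n+1}\|^{2p} \leq \left( \|M_n\|^2 + M^2 \right)^{p} + 2p\langle M_n, \xi_{n+1} \rangle\left( \|M_n\|^2 + M^2 \right)^{p-1} + \sum_{k=2}^{p}\binom{p}{k}\left| 2\langle M_n, \xi_{n+1} \rangle \right|^{k}\left( \|M_n\|^2 + M^2 \right)^{p-k}. \tag{47} $$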


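We now bound the expectation of the three terms on the right-hand side of the previous
inequality. First, by induction,

$$ \begin{aligned} \mathbb{E}\left[ \left( \|M_n\|^2 + M^2 \right)^{p} \right] &= \mathbb{E}\left[ \|M_n\|^{2p} \right] + \sum_{k=0}^{p-1}\binom{p}{k}M^{2(p-k)}\mathbb{E}\left[ \|M_n\|^{2k} \right] \\ &\leq \mathbb{E}\left[ \|M_n\|^{2p} \right] + \sum_{k=0}^{p-1}\binom{p}{k}M^{2(p-k)}C_k(n-1)^{k} \\ &= \mathbb{E}\left[ \|M_n\|^{2p} \right] + O\left( n^{p-1} \right). \end{aligned} $$

Moreover, since $(\xi_n)$ is a sequence of martingale differences adapted to a filtration $(\mathcal{F}_n)$,
and since $M_n$ is $\mathcal{F}_n$-measurable,

$$ \mathbb{E}\left[ \langle M_n, \xi_{n+1} \rangle\left( \|M_n\|^2 + M^2 \right)^{p-1} \right] = \mathbb{E}\left[ \langle M_n, \mathbb{E}\left[ \xi_{n+1} | \mathcal{F}_n \right] \rangle\left( \|M_n\|^2 + M^2 \right)^{p-1} \right] = 0. \tag{48} $$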


Finally, applying Cauchy-Schwarz's inequality and Lemma A.l, since ||^„|| < M, let 


(*) •= E [|2(M„,^„+i)|'^(||M„||2 + m2) 


p—k 


k=2 

V 


\\Mnf [\\Mnf + ^ 

< (^^2P-^M’'(e ||M„f ). 


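Applying Cauchy-Schwarz's inequality and by induction,

\[
\begin{aligned}
(*) &\leq \sum_{k=2}^{p} \binom{p}{k} 2^{p-1} M^k \left( \sqrt{ \mathbb{E}\left[ \|M_n\|^{2p-2} \right] } \sqrt{ \mathbb{E}\left[ \|M_n\|^{2(p+1-k)} \right] }
+ M^{2p-2k} \sqrt{ \mathbb{E}\left[ \|M_n\|^{2} \right] } \sqrt{ \mathbb{E}\left[ \|M_n\|^{2k-2} \right] } \right) \\
&\leq \sum_{k=2}^{p} \binom{p}{k} 2^{p-1} M^k \left( \sqrt{ C_{p-1} C_{p+1-k} } \; n^{p-k/2} + M^{2p-2k} \sqrt{ C_1 C_{k-1} } \; n^{k/2} \right) \\
&= O\left( n^{p-1} \right) ,
\end{aligned} \tag{49}
\]

since $p \geq 2$. Thus, thanks to inequalities (47) to (49), there is a non-negative constant $A_p$
such that for all $n \geq 2$,

\[
\begin{aligned}
\mathbb{E}\left[ \|M_{n+1}\|^{2p} \right]
&\leq \mathbb{E}\left[ \|M_n\|^{2p} \right] + A_p n^{p-1} \\
&\leq \mathbb{E}\left[ \|M_2\|^{2p} \right] + A_p \sum_{k=2}^{n} k^{p-1} \\
&\leq M^{2p} + A_p n^{p} ,
\end{aligned}
\]

which concludes the induction and the proof. □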
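As an illustration of Lemma A.2, the following minimal sketch computes the moments exactly in the simplest possible case. It assumes $H = \mathbb{R}$ and i.i.d. signs $\xi_k = \pm 1$ (so $M = 1$), which is only one particular martingale difference sequence; the function names `sign_sum_moment` and `moment_ratio` are hypothetical and introduced for this example only.

```python
from fractions import Fraction
from math import comb


def sign_sum_moment(m: int, p: int) -> Fraction:
    """Exact E[(xi_1 + ... + xi_m)^(2p)] for i.i.d. signs xi_k = +1 or -1.

    The sum equals 2*B - m with B ~ Binomial(m, 1/2), so the moment is a
    finite sum over the binomial distribution.
    """
    total = sum(comb(m, j) * (2 * j - m) ** (2 * p) for j in range(m + 1))
    return Fraction(total, 2 ** m)


def moment_ratio(n: int, p: int) -> Fraction:
    """E[|M_n|^(2p)] / n^p, where M_n = xi_2 + ... + xi_n has n - 1 terms."""
    return sign_sum_moment(n - 1, p) / n ** p


if __name__ == "__main__":
    # If the lemma holds, each row should stay bounded as n grows.
    for p in (1, 2, 3):
        ratios = [float(moment_ratio(n, p)) for n in (10, 100, 1000)]
        print(p, ratios)
```

For $p = 1$ the ratio is exactly $(n-1)/n$, which matches the first step of the induction above with $M = 1$.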

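Proof of Theorem 4.2. Let us recall the following decomposition

\[
\Gamma_m \left( \overline{Z}_n - m \right) = \frac{1}{n} \left( \frac{T_1}{\gamma_1} - \frac{T_{n+1}}{\gamma_n}
+ \sum_{k=2}^{n} T_k \left( \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right)
- \sum_{k=1}^{n} \delta_k + \sum_{k=1}^{n} \xi_{k+1} \right) , \tag{50}
\]

with $T_n = Z_n - m$. Let $\lambda_{\min} > 0$ be the smallest eigenvalue of $\Gamma_m$; we have with Lemma A.1,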


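\[
\begin{aligned}
\mathbb{E}\left[ \left\| \overline{Z}_n - m \right\|^{2p} \right]
&\leq \frac{5^{2p-1}}{\lambda_{\min}^{2p} n^{2p}} \mathbb{E}\left[ \left\| \frac{T_1}{\gamma_1} \right\|^{2p} \right]
+ \frac{5^{2p-1}}{\lambda_{\min}^{2p} n^{2p}} \mathbb{E}\left[ \left\| \frac{T_{n+1}}{\gamma_n} \right\|^{2p} \right] \\
&\quad + \frac{5^{2p-1}}{\lambda_{\min}^{2p} n^{2p}} \mathbb{E}\left[ \left\| \sum_{k=2}^{n} T_k \left( \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right) \right\|^{2p} \right] \\
&\quad + \frac{5^{2p-1}}{\lambda_{\min}^{2p} n^{2p}} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \delta_k \right\|^{2p} \right]
+ \frac{5^{2p-1}}{\lambda_{\min}^{2p} n^{2p}} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \xi_{k+1} \right\|^{2p} \right] .
\end{aligned}
\]
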
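We now bound each term on the right-hand side of the previous inequality. Since $Z_1$ is almost
surely bounded, we have $\frac{1}{n^{2p}} \mathbb{E}\left[ \left\| \frac{T_1}{\gamma_1} \right\|^{2p} \right] = O\left( \frac{1}{n^{2p}} \right)$. Moreover, with Theorem 4.1,

\[
\frac{1}{n^{2p}} \mathbb{E}\left[ \left\| \frac{T_{n+1}}{\gamma_n} \right\|^{2p} \right]
\leq \frac{1}{n^{2p}} \, \frac{K_p \, n^{2p\alpha}}{c_\gamma^{2p} (n+1)^{p\alpha}}
\leq \frac{K_p}{c_\gamma^{2p} \, n^{(2-\alpha)p}}
= o\left( \frac{1}{n^p} \right) , \tag{51}
\]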


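since $\alpha < 1$. In the same way, since $\left| \frac{1}{\gamma_{k-1}} - \frac{1}{\gamma_k} \right| \leq 2 \alpha c_\gamma^{-1} k^{\alpha - 1}$, applying Lemma 4.3 and
Theorem 4.1,

\[
\begin{aligned}
\frac{1}{n^{2p}} \mathbb{E}\left[ \left\| \sum_{k=2}^{n} T_k \left( \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right) \right\|^{2p} \right]
&\leq \frac{1}{n^{2p}} \left( \sum_{k=2}^{n} \left| \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right| \left( \mathbb{E}\left[ \|T_k\|^{2p} \right] \right)^{\frac{1}{2p}} \right)^{2p} \\
&\leq \frac{2^{2p} \alpha^{2p} c_\gamma^{-2p} K_p}{n^{2p}} \left( \sum_{k=2}^{n} \frac{1}{k^{1 - \alpha/2}} \right)^{2p} \\
&= O\left( \frac{1}{n^{(2-\alpha)p}} \right) .
\end{aligned} \tag{52}
\]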
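The order of magnitude in (52) comes from comparing the sum $\sum_{k=2}^{n} k^{\alpha/2 - 1}$ with an integral, which gives the bound $(2/\alpha) n^{\alpha/2}$. The sketch below checks this numerically; the values of `alpha` and `p` are illustrative assumptions, and the function names are hypothetical.

```python
def weighted_sum_term(n: int, alpha: float, p: int) -> float:
    """(1 / n^(2p)) * (sum_{k=2}^n k^(alpha/2 - 1))^(2p)."""
    s = sum(k ** (alpha / 2 - 1) for k in range(2, n + 1))
    return s ** (2 * p) / n ** (2 * p)


def rescaled(n: int, alpha: float, p: int) -> float:
    """Same quantity multiplied by n^((2 - alpha) * p).

    The integral comparison bounds it by (2 / alpha)^(2p) for every n.
    """
    return weighted_sum_term(n, alpha, p) * n ** ((2 - alpha) * p)


if __name__ == "__main__":
    alpha, p = 0.75, 2
    for n in (100, 1000, 10000):
        print(n, rescaled(n, alpha, p), (2 / alpha) ** (2 * p))
```

Since $\alpha < 1$, the exponent $(2-\alpha)p$ is larger than $p$, so this term is negligible compared with $1/n^p$.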


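Finally, since $\|\delta_n\| \leq C_m \|Z_n - m\|^2$, applying Lemma 4.3 and Theorem 4.1,

\[
\begin{aligned}
\frac{1}{n^{2p}} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \delta_k \right\|^{2p} \right]
&\leq \frac{1}{n^{2p}} \left( \sum_{k=1}^{n} \left( \mathbb{E}\left[ \|\delta_k\|^{2p} \right] \right)^{\frac{1}{2p}} \right)^{2p} \\
&\leq \frac{C_m^{2p}}{n^{2p}} \left( \sum_{k=1}^{n} \left( \mathbb{E}\left[ \|Z_k - m\|^{4p} \right] \right)^{\frac{1}{2p}} \right)^{2p} \\
&\leq \frac{C_m^{2p} K_{2p}}{n^{2p}} \left( \sum_{k=1}^{n} \frac{1}{k^{\alpha}} \right)^{2p} \\
&= O\left( \frac{1}{n^{2\alpha p}} \right) .
\end{aligned} \tag{53}
\]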


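Since $\alpha > 1/2$, we have $\frac{1}{n^{2p}} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \delta_k \right\|^{2p} \right] = o\left( \frac{1}{n^p} \right)$. Finally, applying Lemma A.2, there
is a positive constant $C_p$ such that for all $n \geq 1$,

\[
\frac{1}{n^{2p}} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \xi_{k+1} \right\|^{2p} \right]
\leq \frac{C_p n^p}{n^{2p}}
= O\left( \frac{1}{n^p} \right) . \tag{54}
\]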


We deduce from inequalities (51) to (54) that for all integer $p \geq 1$, there is a positive
constant $A_p$ such that for all $n \geq 1$,

\[
\mathbb{E}\left[ \left\| \overline{Z}_n - m \right\|^{2p} \right] \leq \frac{A_p}{n^p} . \tag{55}
\]

□
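The rate in (55) can be observed on simulated data. The sketch below is only an illustration under stated assumptions: the recursion written in the docstring is the averaged stochastic gradient algorithm as we understand it from the main text (it is not restated in this appendix), the data are standard Gaussian vectors in $\mathbb{R}^3$ (whose geometric median is the origin), and the names `averaged_median` and `norm` as well as the default values of `c_gamma` and `alpha` are hypothetical choices.

```python
import math
import random


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def averaged_median(sample, z1, c_gamma=1.0, alpha=0.75):
    """Return (Z_n, Zbar_n) after one pass over `sample`.

    Assumed recursion, with gamma_n = c_gamma * n^(-alpha):
        Z_{n+1}    = Z_n + gamma_n * (X_{n+1} - Z_n) / ||X_{n+1} - Z_n||
        Zbar_{n+1} = Zbar_n + (Z_{n+1} - Zbar_n) / (n + 1)
    """
    z = list(z1)       # Z_1, a fixed (hence bounded) starting point
    zbar = list(z1)    # Zbar_1 = Z_1
    for n, x in enumerate(sample, start=1):   # x plays the role of X_{n+1}
        diff = [xi - zi for xi, zi in zip(x, z)]
        d = norm(diff)
        if d > 0.0:    # the direction is undefined when X_{n+1} = Z_n
            gamma = c_gamma * n ** (-alpha)
            z = [zi + gamma * di / d for zi, di in zip(z, diff)]
        zbar = [zb + (zi - zb) / (n + 1) for zb, zi in zip(zbar, z)]
    return z, zbar


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20000)]
    z, zbar = averaged_median(data, z1=[1.0, 1.0, 1.0])
    print("||Z_n - m||    =", norm(z))
    print("||Zbar_n - m|| =", norm(zbar))
```

A single run does not prove anything about expectations; repeating it over independent samples and several values of $n$ would be needed to compare the empirical quadratic error with $A_1 / n$.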

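Proof of Proposition 4.1. We now give a lower bound of $\mathbb{E}\left[ \left\| \overline{Z}_n - m \right\|^2 \right]$. One can check that
$\frac{1}{n} \sum_{k=1}^{n} \xi_{k+1}$ is the dominant term in decomposition (50). Indeed, decomposition (50) can be
written as

\[
\Gamma_m \left( \overline{Z}_n - m \right) = \frac{1}{n} \sum_{k=1}^{n} \xi_{k+1} + \frac{1}{n} R_n , \tag{56}
\]

with

\[
R_n := \frac{T_1}{\gamma_1} - \frac{T_{n+1}}{\gamma_n} + \sum_{k=2}^{n} T_k \left( \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right) - \sum_{k=1}^{n} \delta_k .
\]

Applying inequalities (51), (52) and (53), one can check that

\[
\frac{1}{n^2} \mathbb{E}\left[ \|R_n\|^2 \right] = o\left( \frac{1}{n} \right) . \tag{57}
\]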


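Moreover,

\[
\mathbb{E}\left[ \left\| \Gamma_m \left( \overline{Z}_n - m \right) \right\|^2 \right]
= \frac{1}{n^2} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \xi_{k+1} \right\|^2 \right]
+ \frac{1}{n^2} \mathbb{E}\left[ \|R_n\|^2 \right]
+ \frac{2}{n^2} \mathbb{E}\left[ \left\langle \sum_{k=1}^{n} \xi_{k+1} , R_n \right\rangle \right] . \tag{58}
\]

Applying Cauchy-Schwarz's inequality and Lemma 4.3, there is a positive constant $C_1$ such
that for all $n \geq 1$,

\[
\begin{aligned}
\frac{2}{n^2} \left| \mathbb{E}\left[ \left\langle \sum_{k=1}^{n} \xi_{k+1} , R_n \right\rangle \right] \right|
&\leq \frac{2}{n^2} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \xi_{k+1} \right\| \|R_n\| \right] \\
&\leq 2 \sqrt{ \frac{1}{n^2} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \xi_{k+1} \right\|^2 \right] } \sqrt{ \frac{1}{n^2} \mathbb{E}\left[ \|R_n\|^2 \right] } \\
&\leq 2 \sqrt{ \frac{C_1}{n} } \sqrt{ \frac{1}{n^2} \mathbb{E}\left[ \|R_n\|^2 \right] } \\
&= o\left( \frac{1}{n} \right) .
\end{aligned}
\]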

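Moreover, since $\mathbb{E}\left[ \|\xi_{n+1}\|^2 \right] = 1 - \mathbb{E}\left[ \|\Phi(Z_n)\|^2 \right]$ (see Cardot et al. (2013) for details),
using the fact that $(\xi_n)$ is a sequence of martingale differences adapted to the filtration $(\mathcal{F}_n)$,
we get

\[
\begin{aligned}
\frac{1}{n^2} \mathbb{E}\left[ \left\| \sum_{k=1}^{n} \xi_{k+1} \right\|^2 \right]
&= \frac{1}{n^2} \sum_{k=1}^{n} \mathbb{E}\left[ \|\xi_{k+1}\|^2 \right] + \frac{2}{n^2} \sum_{k=1}^{n} \sum_{k'=k+1}^{n} \mathbb{E}\left[ \langle \xi_{k+1} , \xi_{k'+1} \rangle \right] \\
&= \frac{1}{n^2} \sum_{k=1}^{n} \mathbb{E}\left[ \|\xi_{k+1}\|^2 \right] + \frac{2}{n^2} \sum_{k=1}^{n} \sum_{k'=k+1}^{n} \mathbb{E}\left[ \left\langle \xi_{k+1} , \mathbb{E}\left[ \xi_{k'+1} \mid \mathcal{F}_{k'} \right] \right\rangle \right] \\
&= \frac{1}{n} - \frac{1}{n^2} \sum_{k=1}^{n} \mathbb{E}\left[ \|\Phi(Z_k)\|^2 \right] .
\end{aligned}
\]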

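Moreover, since $\|\Phi(Z_n)\| \leq C \|Z_n - m\|$, applying Theorem 4.1, we have

\[
\frac{1}{n^2} \sum_{k=1}^{n} \mathbb{E}\left[ \|\Phi(Z_k)\|^2 \right]
\leq \frac{C^2}{n^2} \sum_{k=1}^{n} \mathbb{E}\left[ \|Z_k - m\|^2 \right]
\leq \frac{C^2 K_1}{n^2} \sum_{k=1}^{n} \frac{1}{k^{\alpha}}
= o\left( \frac{1}{n} \right) .
\]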
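Finally,

\[
\mathbb{E}\left[ \left\| \Gamma_m \left( \overline{Z}_n - m \right) \right\|^2 \right] = \frac{1}{n} + o\left( \frac{1}{n} \right) . \tag{59}
\]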


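Thus, since the largest eigenvalue of $\Gamma_m$ satisfies $\lambda_{\max} \leq C$, there is a rank $n_\alpha$ such that for
all $n \geq n_\alpha$,

\[
\mathbb{E}\left[ \left\| \overline{Z}_n - m \right\|^2 \right] \geq \frac{1}{2 C^2 n} .
\]

Let $c'' := \min \left\{ \min_{1 \leq k \leq n_\alpha} \left\{ k \, \mathbb{E}\left[ \left\| \overline{Z}_k - m \right\|^2 \right] \right\} , \frac{1}{2 C^2} \right\}$; for all $n \geq 1$,

\[
\mathbb{E}\left[ \left\| \overline{Z}_n - m \right\|^2 \right] \geq \frac{c''}{n} . \tag{60}
\]

□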


A.3 Proofs of Section 5 

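Proof of Theorem 5.1. Let $\beta' \in (1/2, 1)$ such that $\beta' < \alpha$. In order to apply Borel-Cantelli's
Lemma, we will prove that

\[
\sum_{n \geq 1} \mathbb{P}\left( \|Z_n - m\| \geq \frac{1}{n^{\beta'/2}} \right) < \infty . \tag{61}
\]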


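Applying Theorem 4.1 and Markov's inequality, for all $p \geq 1$, for all $n \geq 1$,

\[
\mathbb{P}\left( \|Z_n - m\| \geq \frac{1}{n^{\beta'/2}} \right)
\leq n^{p \beta'} \, \mathbb{E}\left[ \|Z_n - m\|^{2p} \right]
\leq \frac{K_p}{n^{p(\alpha - \beta')}} .
\]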


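Since $\beta' < \alpha$, we can take $p > \frac{1}{\alpha - \beta'}$ and we get

\[
\sum_{n \geq 1} \mathbb{P}\left( \|Z_n - m\| \geq \frac{1}{n^{\beta'/2}} \right) \leq \sum_{n \geq 1} \frac{K_p}{n^{p(\alpha - \beta')}} < \infty .
\]

Applying Borel-Cantelli's Lemma,

\[
\|Z_n - m\| = O\left( n^{-\beta'/2} \right) \quad a.s., \tag{62}
\]

for all $\beta' < \alpha$. In particular, for all $\beta < \alpha$,

\[
\|Z_n - m\| = o\left( n^{-\beta/2} \right) \quad a.s. \tag{63}
\]

□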
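The only constraint on $p$ in this proof is that the series $\sum_n n^{-p(\alpha - \beta')}$ converges, that is $p(\alpha - \beta') > 1$. The following sketch makes this choice explicit for given exponents and checks the standard bound $1 + 1/(s-1)$ on the partial sums of $\sum_n n^{-s}$ when $s > 1$. The numerical values and the names `smallest_p` and `partial_sum` are assumptions made for the example.

```python
import math


def smallest_p(alpha: float, beta_prime: float) -> int:
    """Smallest integer p >= 1 such that p * (alpha - beta_prime) > 1."""
    if not 0.0 < beta_prime < alpha:
        raise ValueError("need 0 < beta_prime < alpha")
    return math.floor(1.0 / (alpha - beta_prime)) + 1


def partial_sum(alpha: float, beta_prime: float, p: int, n_max: int) -> float:
    """sum_{n=1}^{n_max} n^(-p * (alpha - beta_prime))."""
    s = p * (alpha - beta_prime)
    return sum(n ** (-s) for n in range(1, n_max + 1))


if __name__ == "__main__":
    alpha, beta_prime = 0.75, 0.6
    p = smallest_p(alpha, beta_prime)
    s = p * (alpha - beta_prime)
    print("p =", p, " exponent s =", s)
    print("partial sum:", partial_sum(alpha, beta_prime, p, 10000))
    print("bound 1 + 1/(s - 1):", 1 + 1 / (s - 1))
```

The closer $\beta'$ is to $\alpha$, the larger $p$ must be, which is why Theorem 4.1 is needed for every integer $p$.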


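Proof of Corollary 5.1. Let us recall decomposition (50) of the averaged algorithm:

\[
\Gamma_m \left( \overline{Z}_n - m \right) = \frac{1}{n} \left( \frac{T_1}{\gamma_1} - \frac{T_{n+1}}{\gamma_n}
+ \sum_{k=2}^{n} T_k \left( \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right)
- \sum_{k=1}^{n} \delta_k + \sum_{k=1}^{n} \xi_{k+1} \right) .
\]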


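We will give the almost sure rate of convergence of each term. First, since $Z_1$ is bounded,
we have $\frac{\|T_1\|}{n \gamma_1} = O\left( \frac{1}{n} \right)$ almost surely. Applying Theorem 5.1, let $\beta' < \alpha$,

\[
\frac{\|T_{n+1}\|}{n \gamma_n} = o\left( \frac{n^{-\beta'/2}}{n^{1-\alpha}} \right) \; a.s.
= o\left( \frac{1}{\sqrt{n}} \right) \; a.s.
\]

Indeed, we obtain the last equality by taking $\alpha > \beta' > 2\alpha - 1$, which is possible since $\alpha < 1$.
Moreover, since $\left| \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right| \leq 2 \alpha c_\gamma^{-1} k^{\alpha - 1}$, let $\beta' < \alpha$; applying Theorem 5.1,

\[
\begin{aligned}
\frac{1}{n} \left\| \sum_{k=2}^{n} T_k \left( \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right) \right\|
&\leq \frac{1}{n} \sum_{k=2}^{n} \|T_k\| \left| \frac{1}{\gamma_{k-1}} - \frac{1}{\gamma_k} \right| \\
&= o\left( \frac{1}{n} \sum_{k=2}^{n} \frac{1}{k^{1 + \beta'/2 - \alpha}} \right) \; a.s. \\
&= o\left( \frac{1}{n^{1 + \beta'/2 - \alpha}} \right) \; a.s. \\
&= o\left( \frac{1}{\sqrt{n}} \right) \; a.s.
\end{aligned}
\]

Indeed, we get the last equality taking $\beta' > 2\alpha - 1$. Moreover, since $\|\delta_n\| \leq C_m \|Z_n - m\|^2$,
for all $\beta' < \alpha$,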


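\[
\begin{aligned}
\frac{1}{n} \left\| \sum_{k=1}^{n} \delta_k \right\|
&\leq \frac{1}{n} \sum_{k=1}^{n} \|\delta_k\| \\
&\leq \frac{C_m}{n} \sum_{k=1}^{n} \|Z_k - m\|^2 \\
&= o\left( \frac{1}{n} \sum_{k=1}^{n} \frac{1}{k^{\beta'}} \right) \; a.s. \\
&= o\left( \frac{1}{n^{\beta'}} \right) \; a.s. \\
&= o\left( \frac{1}{\sqrt{n}} \right) \; a.s.
\end{aligned}
\]

Indeed, we obtain the last equality by taking $\alpha > \beta' > 1/2$. Finally, since $\mathbb{E}\left[ \left\| \sum_{k=1}^{n} \xi_{k+1} \right\|^2 \right] = n + o(n)$
(see Cardot et al. (2013) and the proof of Theorem 4.2), applying the law of large numbers for
martingales (see Theorem 1.3.15 in Duflo (1997)), for all $\delta > 0$,

\[
\frac{1}{n} \left\| \sum_{k=1}^{n} \xi_{k+1} \right\| = o\left( \frac{(\ln n)^{\frac{1+\delta}{2}}}{\sqrt{n}} \right) \; a.s., \tag{64}
\]

which concludes the proof.

□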


Remark A.1. Note that the law of large numbers for martingales in Duflo (1997) is not given for
general Hilbert spaces. Nevertheless, in our context, this law of large numbers can be extended. We
just have to prove that for all positive constant $\delta$, $U_n := \frac{1}{\sqrt{n (\ln(n))^{1+\delta}}} \left\| \sum_{k=1}^{n} \xi_{k+1} \right\|$ converges almost
surely to a finite random variable. Since $(\xi_n)$ is a sequence of martingale differences adapted to the
filtration $(\mathcal{F}_n)$ and since $\mathbb{E}\left[ \|\xi_{n+2}\|^2 \mid \mathcal{F}_{n+1} \right] \leq 1$,

\[
\begin{aligned}
\mathbb{E}\left[ U_{n+1}^2 \mid \mathcal{F}_{n+1} \right]
&= \frac{n (\ln(n))^{1+\delta}}{(n+1)(\ln(n+1))^{1+\delta}} \, U_n^2 + \frac{1}{(n+1)(\ln(n+1))^{1+\delta}} \, \mathbb{E}\left[ \|\xi_{n+2}\|^2 \mid \mathcal{F}_{n+1} \right] \\
&\leq U_n^2 + \frac{1}{(n+1)(\ln(n+1))^{1+\delta}} .
\end{aligned}
\]

Thus, applying Robbins-Siegmund Theorem (see Duflo (1997)), $(U_n^2)$ converges almost surely to a
finite random variable, which concludes the proof.